30        Bioinformatics

A warning sign is displayed if the sum of the deviations of the observed distribution from
the normal distribution represents more than 15% of the reads. A failure sign is displayed
if the sum of the deviations represents more than 30% of the reads.
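The 15% and 30% thresholds can be illustrated with a short Python sketch. It is a simplified approximation, not the quality control tool's own code. It assumes that the theoretical normal curve is fitted from the mean and standard deviation of the observed per-read GC percentages, and the function names gc_percent, gc_deviation_percent, and gc_status are hypothetical.

```python
import math


def gc_percent(read):
    """Return the GC percentage of one read, rounded to a whole percent."""
    called = [b for b in read.upper() if b in "ACGT"]
    if not called:
        return 0
    gc = sum(1 for b in called if b in "GC")
    return round(100 * gc / len(called))


def gc_deviation_percent(reads):
    """Sum of |observed - expected| read counts over all GC bins, as % of reads.

    Assumption: the expected curve is a normal distribution fitted with the
    mean and standard deviation of the observed GC values.
    """
    values = [gc_percent(r) for r in reads]
    n = len(values)
    if n == 0:
        return 0.0
    observed = [0] * 101  # one bin per whole GC percentage, 0..100
    for v in values:
        observed[v] += 1
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        return 0.0  # all reads share one GC value; nothing to compare
    expected = [
        n * math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))
        for x in range(101)
    ]
    deviation = sum(abs(o - e) for o, e in zip(observed, expected))
    return 100 * deviation / n


def gc_status(deviation_percent, warn=15.0, fail=30.0):
    """Map the summed deviation to pass, warn, or fail using the thresholds above."""
    if deviation_percent > fail:
        return "fail"
    if deviation_percent > warn:
        return "warn"
    return "pass"
```

For example, gc_status(gc_deviation_percent(reads)) returns "fail" for a strongly bimodal set of reads, such as a mixture of AT-only and GC-only sequences, because the observed counts lie far from any single bell-shaped curve.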

1.5.7  Per Base N Content

During the sequencing process, a base is normally called with high confidence. However,
because of some fault, the machine may fail to call any base at a specific position. The “N”
character is then placed at that position to indicate the call failure. A few call failures
are tolerable; however, a high frequency of “N” indicates a quality problem. The per base N
content graph shows the percentage of “N” calls at each base position. The N percentages are
plotted on the y-axis against the base positions on the x-axis. A warning is issued if any
position shows an N content greater than 5%, and a failure sign is displayed if any position
shows an N content greater than 20%. Figure 1.21 shows a per base N content graph with no
problems.
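The calculation behind the per base N content graph can be sketched in a few lines of Python. This is a minimal illustration that assumes the reads are already loaded as plain strings. The function names per_base_n_content and n_content_status are hypothetical, and the sketch is not the quality control tool's own code.

```python
def per_base_n_content(reads):
    """Return the percentage of 'N' calls at each base position."""
    max_len = max((len(r) for r in reads), default=0)
    n_counts = [0] * max_len  # number of 'N' calls seen at each position
    totals = [0] * max_len    # number of reads that cover each position
    for read in reads:
        for i, base in enumerate(read.upper()):
            totals[i] += 1
            if base == "N":
                n_counts[i] += 1
    return [100 * n / t for n, t in zip(n_counts, totals)]


def n_content_status(percentages, warn=5.0, fail=20.0):
    """Apply the 5% warning and 20% failure thresholds to the worst position."""
    worst = max(percentages, default=0.0)
    if worst > fail:
        return "fail"
    if worst > warn:
        return "warn"
    return "pass"
```

For the four reads ACGT, ANGT, ACGN, and ACGT, the second and fourth positions each contain one “N” out of four calls (25%), so n_content_status reports a failure.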

1.5.8  Sequence Length Distribution

In the library preparation step, DNA molecules are cut into equal fragments to generate

reads with equal lengths. Most sequencing instruments run quality control to keep the

FIGURE 1.21  Per base N content.